[FLINK-14925][table] Support precision-aware TO_TIMESTAMP with format-based inference #27793

Open
raminqaf wants to merge 6 commits into apache:master from raminqaf:FLINK-14925

@raminqaf
Contributor

What is the purpose of the change

This pull request makes the TO_TIMESTAMP function precision-aware when a format pattern is provided. Previously,
TO_TIMESTAMP always returned TIMESTAMP(3) regardless of the format pattern's fractional second precision, which forced users to lose sub-millisecond data. This is the TO_TIMESTAMP counterpart to the TO_TIMESTAMP_LTZ precision support added in FLINK-39244.
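
To make the data loss concrete, here is a minimal sketch in plain `java.time`, not Flink code. The class `SubMillisecondLoss` and its helpers are hypothetical names used only for illustration; truncating to milliseconds stands in for what a fixed TIMESTAMP(3) result type implies for a value parsed with a six-digit fraction.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration class; not part of Flink or of this PR.
final class SubMillisecondLoss {

    // Parse a timestamp string with a DateTimeFormatter-style pattern.
    static LocalDateTime parse(String text, String pattern) {
        return LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern));
    }

    // Models a precision-3 result: everything below a millisecond is dropped.
    static LocalDateTime truncateToMillis(LocalDateTime value) {
        return value.truncatedTo(ChronoUnit.MILLIS);
    }

    public static void main(String[] args) {
        LocalDateTime full =
                parse("2024-01-01 12:34:56.123456", "yyyy-MM-dd HH:mm:ss.SSSSSS");
        System.out.println(full.getNano());                   // 123456000
        System.out.println(truncateToMillis(full).getNano()); // 123000000
    }
}
```

The microsecond digits `456` survive parsing but are gone once the value is forced to millisecond precision, which is the behavior this change avoids for the 2-arg variant.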

The output type for the 1-arg variant remains TIMESTAMP(3) for backward compatibility. For the 2-arg variant, precision is
inferred from the format pattern's trailing S count (e.g., SSSSSS → TIMESTAMP(6)), with a minimum of 3.
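
The inference rule described above can be sketched as follows. This is a simplified, standalone approximation rather than the PR's actual `precisionFromFormat` implementation: the class and method names are hypothetical, and the sketch only counts the `S` characters at the very end of the pattern.

```java
// Hypothetical sketch of the rule "precision = max(trailing S count, 3)".
// It is not the code from this PR and ignores any S that is not trailing.
final class TimestampPrecisionSketch {

    // Precision used when the format carries fewer than three trailing S characters.
    static final int MIN_PRECISION = 3;

    // Count consecutive 'S' characters at the end of the format pattern.
    static int trailingSCount(String format) {
        int count = 0;
        for (int i = format.length() - 1; i >= 0 && format.charAt(i) == 'S'; i--) {
            count++;
        }
        return count;
    }

    // Apply the minimum of 3 described in the PR text.
    static int inferPrecision(String format) {
        return Math.max(trailingSCount(format), MIN_PRECISION);
    }

    public static void main(String[] args) {
        System.out.println(inferPrecision("yyyy-MM-dd HH:mm:ss.SSSSSS"));    // 6
        System.out.println(inferPrecision("yyyy-MM-dd HH:mm:ss.SSSSSSSSS")); // 9
        System.out.println(inferPrecision("yyyy-MM-dd HH:mm:ss"));           // 3
    }
}
```

Note that the lowercase `s` in `ss` (seconds) does not count toward the precision; only the uppercase fraction-of-second letter does.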

As part of this change, TO_TIMESTAMP is migrated from the legacy Calcite-native function pattern (FlinkSqlOperatorTable + StringCallGen codegen) to the modern bridging function pattern (BuiltInFunctionDefinition + runtimeClass), matching how
TO_TIMESTAMP_LTZ is implemented. This was made possible by fixing the function name from camelCase "toTimestamp" to "TO_TIMESTAMP", which allows CoreModule to resolve it correctly for SQL queries without needing a separate FlinkSqlOperatorTable entry.

Brief change log

  • Type strategy (ToTimestampTypeStrategy): New output type strategy that returns TIMESTAMP(3) for the 1-arg variant
    and TIMESTAMP(max(sCount, 3)) for the 2-arg variant, where sCount is inferred from the format pattern's trailing S
    characters.
  • Runtime function (ToTimestampFunction): New runtime class with eval(StringData) and eval(StringData, StringData)
    methods. The 2-arg variant passes precisionFromFormat(format) to parseTimestampData for precision-aware parsing.
  • Function definition (BuiltInFunctionDefinitions): Changed name from "toTimestamp" to "TO_TIMESTAMP" (removing the need for explicit sqlName), added runtimeClass, and switched output type strategy to SpecificTypeStrategies.TO_TIMESTAMP.
  • Removed legacy plumbing: Removed FlinkSqlOperatorTable.TO_TIMESTAMP, DirectConvertRule mapping, StringCallGen
    cases, and BuiltInMethods.STRING_TO_TIMESTAMP / STRING_TO_TIMESTAMP_WITH_FORMAT — all superseded by the bridging function mechanism.
  • Documentation: Updated sql_functions.yml, sql_functions_zh.yml, and Python expressions.py / expression.py
    docstrings with precision-dependent output types and examples.

Verifying this change

This change added tests and can be verified as follows:

  • Added type strategy unit tests in ToTimestampTypeStrategyTest covering 1-arg default precision, 2-arg format-based
    precision (SSS/SSSSSS/SSSSSSSSS/no-S), invalid argument types, and argument count validation.
  • Added integration tests in TimeFunctionsITCase for 1-arg truncation to precision 3, 2-arg precision 6/9 from format, SSS format staying at precision 3, fewer input digits than format precision, unparsable string, and null input.
  • Removed redundant legacy tests from TemporalTypesTest.scala that are now covered by the new TimeFunctionsITCase tests.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no — the ToTimestampFunction.eval() methods call the same DateTimeUtils.parseTimestampData methods as before.
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? docs / JavaDocs / PyDocs

@snuyanzin snuyanzin marked this pull request as ready for review March 20, 2026 08:55
@flinkbot
Collaborator

flinkbot commented Mar 20, 2026

CI report:


@raminqaf raminqaf marked this pull request as draft March 20, 2026 08:58
@raminqaf raminqaf marked this pull request as ready for review March 20, 2026 08:58
Comment on lines +59 to +63
try {
return parseTimestampData(timestamp.toString());
} catch (DateTimeException e) {
return null;
}
Contributor Author

@raminqaf raminqaf Mar 23, 2026

@twalthr & @snuyanzin
I have added this because of this test:

https://github.com/raminqaf/flink/blob/afe6ef9fb08016e6d7ef3a60421f7f1b7034dc7d/flink-table/flink-table-planner/src/test/scala/org/apache/flink/table/planner/expressions/TemporalTypesTest.scala#L1009-L1009

Currently, TO_TIMESTAMP_LTZ('abc') throws an exception instead of returning null. Should it be changed to return null as well? If so, I can open a follow-up issue/PR.
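
For readers following the discussion, the null-on-failure behavior in the snippet under review can be illustrated with a standalone sketch. It uses plain `java.time` rather than Flink's `DateTimeUtils`, and `NullOnParseFailureSketch` and `parseOrNull` are hypothetical names. It relies on `DateTimeParseException` being a subclass of `DateTimeException`, so the catch clause covers unparsable input.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch; mirrors the try/catch shape under review, not Flink's parser.
final class NullOnParseFailureSketch {

    private static final DateTimeFormatter DEFAULT_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Returns null for null or unparsable input instead of propagating the exception.
    static LocalDateTime parseOrNull(String text) {
        if (text == null) {
            return null;
        }
        try {
            return LocalDateTime.parse(text, DEFAULT_FORMAT);
        } catch (DateTimeException e) {
            // DateTimeParseException extends DateTimeException.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("abc"));                 // null
        System.out.println(parseOrNull("2024-01-01 00:00:00")); // 2024-01-01T00:00
    }
}
```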


- string1: the timestamp string to parse
- string2: the format pattern (default 'yyyy-MM-dd HH:mm:ss'). The pattern follows Java's DateTimeFormatter syntax, where 'S' represents fractional seconds (e.g., 'SSS' for milliseconds, 'SSSSSSSSS' for nanoseconds).
- string2: the format pattern (default 'yyyy-MM-dd HH:mm:ss'). The pattern follows Java's [DateTimeFormatter](https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html) syntax, where 'S' represents fractional seconds (e.g., 'SSS' for milliseconds, 'SSSSSSSSS' for nanoseconds).
Contributor

this seems not fixed yet

"type" : "TIMESTAMP(3)"
},
"serializableString" : "TO_TIMESTAMP(`c`)"
"serializableString" : "`TO_TIMESTAMP`(`c`)"
Contributor

What is the reason for this change?
Usually we don't need to quote function names.

@github-actions github-actions bot added the community-reviewed label Mar 23, 2026